Rank | Count | Beginning |
---|---|---|
66312 | 14451 | The |
23654 | 3776 | He |
37871 | 2632 | It |
33952 | 2523 | In |
30689 | 2149 | I |
85109 | 2082 | This |
90995 | 2050 | We |
646 | 1471 | A |
98104 | 1461 | You |
90997 | 1415 | “We |
66318 | 1169 | “The |
60889 | 1103 | She |
10833 | 1029 | But |
30687 | 998 | “I |
83866 | 957 | They |
49227 | 943 | Mr |
78248 | 795 | There |
890 | 778 | According |
6735 | 737 | As |
31813 | 693 | If |
28609 | 689 | HomeNewsHeadline |
29669 | 662 | However, |
52624 | 625 | On |
4609 | 611 | And |
19750 | 581 | Filed |
16111 | 575 | Dominica |
7986 | 513 | At |
37893 | 493 | “It |
20807 | 473 | For |
80287 | 452 | These |
The next four subsections show the most frequent sentence beginnings consisting of N words, for N=1, 2, 3, 4. This subsection starts with N=1.
The most frequent word-N-grams at the beginning of sentences give some insight into sentence composition.
Especially for N=1, we only need a small corpus to identify the most frequent sentence beginnings.
select substring_index(sentence, ' ', 1) as beg, count(*) as cnt from sentences group by substring_index(sentence, ' ', 1) order by cnt desc limit 50;
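For readers without access to the database, the same count can be approximated in a few lines of Python. This is only a sketch: it assumes the sentences are available as a plain list of strings, and the function name `sentence_beginnings` and the sample sentences are made up for illustration. Passing a larger `n` corresponds to replacing the 1 in `substring_index(sentence, ' ', 1)` with N.

```python
from collections import Counter


def sentence_beginnings(sentences, n=1, limit=50):
    """Return the `limit` most frequent beginnings made of the first n words.

    Hypothetical helper: it mirrors the SQL query above for an in-memory
    list of sentences instead of a `sentences` table.
    """
    counts = Counter()
    for sentence in sentences:
        # Like substring_index(sentence, ' ', n): keep everything before the
        # n-th space, or the whole sentence if it has fewer than n spaces.
        beginning = " ".join(sentence.split(" ")[:n])
        counts[beginning] += 1
    # most_common sorts by descending count, like "order by cnt desc limit 50".
    return counts.most_common(limit)


# Illustrative sample data, not taken from the corpus.
sample = [
    "The cat sat on the mat.",
    "The dog barked.",
    "He left early.",
    "“The end,” she said.",
]

top = sentence_beginnings(sample, n=1)
for beginning, count in top:
    print(count, beginning)
```

As in the table, splitting on spaces alone means that a beginning with an opening quotation mark, such as “The, is counted separately from The.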